import pandas as pd
import numpy as np
from lets_plot import *
LetsPlot.setup_html(isolated_frame=True)
from palmerpenguins import load_penguins
df = load_penguins()
ggplot(df, aes(x="species")) + geom_bar()
df.head()

| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | male | 2007 |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | female | 2007 |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | female | 2007 |
| 3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | NaN | 2007 |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | female | 2007 |
PY4DS: CH2 Data Visualization
# Include and execute your code here
penguins = load_penguins()
penguins
penguins.head()

| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | male | 2007 |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | female | 2007 |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | female | 2007 |
| 3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | NaN | 2007 |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | female | 2007 |
The tabular view makes it easy to inspect each variable in a tidy, simple layout. The variables used in this analysis are species, flipper_length_mm, and body_mass_g.
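The preview also shows a row with missing measurements (index 3), so it can help to isolate the three variables and count the gaps before plotting. The following is a minimal sketch: `sample` is a hand-typed copy of the five rows printed by `penguins.head()`, not the full dataset, and the names `sample`, `subset`, and `complete` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hand-typed copy of the five preview rows shown above (illustration only).
sample = pd.DataFrame(
    {
        "species": ["Adelie"] * 5,
        "flipper_length_mm": [181.0, 186.0, 195.0, np.nan, 193.0],
        "body_mass_g": [3750.0, 3800.0, 3250.0, np.nan, 3450.0],
    }
)

# Keep only the three variables of interest.
subset = sample[["species", "flipper_length_mm", "body_mass_g"]]

# Count missing values per column, then drop incomplete rows.
missing_per_column = subset.isna().sum()
complete = subset.dropna()

print(missing_per_column)
print(complete)
```

In the notebook itself, the same two calls (`isna().sum()` and `dropna()`) could be applied to `penguins` directly.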
For practice, I’d like to answer the following question:
## Do penguins with longer flippers weigh more or less than penguins with shorter flippers?
Recreate the example charts from PY4DS: CH2 Data Visualization of the textbook. (Hint: copy the chart code from 2.2.3. Creating a Plot, one for each cell below)
ggplot(penguins, aes(x="species")) + geom_bar()
ggplot(data=penguins)
(
ggplot(data=penguins, mapping=aes(x="flipper_length_mm", y="body_mass_g"))
+ geom_point()
)
This first plot is very simple, but it is hard to interpret because there is no legend indicating which dots belong to which species.
(
ggplot(
data=penguins,
mapping=aes(x="flipper_length_mm", y="body_mass_g", color="species"),
)
+ geom_point()
)
This one is a lot better: color now distinguishes the species. The red dots are Adelie, the blue are Gentoo, and the green are Chinstrap penguins. This lets us compare body mass with flipper length by species. Looking at the graph, it’s evident that Gentoo penguins tend to have a higher body mass and longer flippers.
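The visual impression that one species is larger can be cross-checked with per-species averages. This is a sketch only: `species_means` is a hypothetical helper name, and it assumes the column names shown in the table above.

```python
import pandas as pd


def species_means(df: pd.DataFrame) -> pd.DataFrame:
    """Mean flipper length and body mass per species, heaviest species first."""
    return (
        df.groupby("species")[["flipper_length_mm", "body_mass_g"]]
        .mean()  # missing values (NaN) are skipped by default
        .sort_values("body_mass_g", ascending=False)
    )


# Usage in this notebook, where `penguins` is already loaded:
# species_means(penguins)
```

If the chart reading is right, Gentoo should appear in the first row of the result.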
(
ggplot(data=penguins, mapping=aes(x="flipper_length_mm", y="body_mass_g"))
+ geom_point(mapping=aes(color="species"))
+ geom_smooth(method="lm")
)
This graph adds a linear regression line to the colored points, but the next one is clearer still.
(
ggplot(data=penguins, mapping=aes(x="flipper_length_mm", y="body_mass_g"))
+ geom_point(mapping=aes(color="species", shape="species"))
+ geom_smooth(method="lm")
)
This one maps both shape and color to species, which lets us tell the species apart visually much faster. The pink line running through the points is a linear regression smoothing line.
Conclusion: Penguins with longer flippers tend to have greater body mass.
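The direction of the smoothing line can also be checked numerically. The sketch below fits an ordinary least-squares line with `numpy.polyfit`, assuming that this is roughly what `geom_smooth(method="lm")` draws. `fit_flipper_mass_line` is a hypothetical helper name, not part of any library.

```python
import numpy as np
import pandas as pd


def fit_flipper_mass_line(df: pd.DataFrame) -> tuple[float, float]:
    """Fit body_mass_g = slope * flipper_length_mm + intercept by least squares."""
    # Drop rows where either measurement is missing before fitting.
    complete = df[["flipper_length_mm", "body_mass_g"]].dropna()
    # polyfit returns coefficients from highest degree to lowest.
    slope, intercept = np.polyfit(
        complete["flipper_length_mm"].to_numpy(),
        complete["body_mass_g"].to_numpy(),
        deg=1,
    )
    return float(slope), float(intercept)


# Usage in this notebook, where `penguins` is already loaded:
# slope, intercept = fit_flipper_mass_line(penguins)
```

A positive slope would agree with the conclusion that body mass increases with flipper length.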